The ‘Update’

Summary
This week I took the feedback from the class to improve my portfolio.

Last class my portfolio was shown. Despite the positive remarks, quite a few points for improvement were mentioned as well. I felt that addressing these points would, for now, improve the clarity and quality of my portfolio more than creating further plots, so that is what I focused on for this week’s update.

To summarise the things I have done this week:

  • I changed my initial Data Exploration plots from scatterplots to violin plots to convey more information about the data at a glance. The changes can be found at The Data.
  • I normalised the magnitude range of all cepstrograms to lie between 0 and 0.8. This way spots with the same brightness have the same magnitude, which allows better comparison between songs.
  • As mentioned in class, Knockout showed a very interesting pattern in its self-similarity matrix: not only a diagonal line from bottom left to top right, but also some vague lines going from top left to bottom right. In the next story I find out where this comes from.
  • When looking at the cepstrograms I found that c03 corresponded with someone singing. c04, however, has only one bright spot in the cepstrogram of Lake of Fire. When listening to this song I heard that it corresponded with a very unique part of the song. This part sounded quite familiar to me: I had heard it in Darkest Hour of the Clock by D-Block & S-te-Fan! To see whether this part is indeed present in that song as well, I made a cepstrogram and self-similarity matrix for it too. This may indicate a unique feature for D-Block & S-te-Fan, but whether it is useful for classification remains to be seen. (The analysis can be found two stories further.)
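The shared magnitude range from the second point can be sketched roughly as follows. This is a small Python illustration, not the R code behind the portfolio. The function name `to_shared_scale` and the toy matrices are made up, and I assume here that the shared range is obtained by clamping every cepstrogram to the same fixed limits before mapping it to a colour.

```python
def to_shared_scale(magnitudes, lo=0.0, hi=0.8):
    """Clamp magnitudes to [lo, hi] and map them to a colour position in 0..1.

    Assumption: every song uses the same lo/hi, so equal positions (and thus
    equal brightness) always mean equal magnitude.
    """
    scaled = []
    for row in magnitudes:
        scaled.append([(min(max(m, lo), hi) - lo) / (hi - lo) for m in row])
    return scaled


# Two made-up 2x2 "cepstrograms" (rows = coefficients, columns = time frames).
song_a = [[0.1, 0.4], [0.8, 0.2]]
song_b = [[0.4, 1.2], [0.0, 0.4]]

a = to_shared_scale(song_a)
b = to_shared_scale(song_b)

# The magnitude 0.4 ends up at the same colour position in both songs.
print(a[0][1], b[0][0])
```

Values above the upper limit (1.2 in the toy data) are drawn at full brightness, so very loud spots can no longer be told apart from each other.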

I also wanted to perform Varimax rotation on the principal components. However, due to some technical difficulties with R, not enough time was left for this.

The symmetry of Knockout
I dive into the interesting pattern found in the self-similarity matrix of Knockout.


When my portfolio was shown in class, Ashley mentioned seeing something unique in the self-similarity matrix of Knockout. To show this more clearly I changed the colors of the cepstrograms and self-similarity matrices to a more contrastive color scheme.

We are now able to distinguish not only the standard diagonal line going from bottom left to top right in the self-similarity matrix of Knockout, but also a vaguer line going from top left to bottom right. This means that there is similarity not only if you play two instances of this track at once (pretty obvious), but also if you play one of the instances in reverse.
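To make this reasoning concrete, here is a minimal Python sketch (not the R code used for the plots). The toy frames and the function name `self_similarity` are made up. It only shows that when the sequence of feature frames in the second half mirrors the first half, both the main diagonal and the opposite diagonal of the distance matrix are zero.

```python
def self_similarity(frames):
    """Return the matrix of Euclidean distances between all pairs of frames."""
    n = len(frames)
    return [
        [sum((a - b) ** 2 for a, b in zip(frames[i], frames[j])) ** 0.5
         for j in range(n)]
        for i in range(n)
    ]


# Toy "song": three feature frames, followed by the same frames in reverse order.
first_half = [[0.0, 1.0], [1.0, 0.5], [2.0, 0.0]]
frames = first_half + first_half[::-1]

ssm = self_similarity(frames)
n = len(frames)

# Frame i compared with itself (the standard diagonal).
main_diagonal = [ssm[i][i] for i in range(n)]
# Frame i compared with frame n-1-i (the track played against its reverse).
anti_diagonal = [ssm[i][n - 1 - i] for i in range(n)]

print(main_diagonal)
print(anti_diagonal)
```

In a real song the mirrored halves are never identical, so the second line shows up as low but non-zero distances, which would fit the vaguer line in the plot.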

We see that this line does not start and end exactly in the corners, which can be explained by the song having an intro that differs from its outro, but it still spans quite a lot of the song. When listening to the song I heard very similar parts. That is why I plotted the cepstrogram of the song again. We see that the song is very symmetrical: you could put a vertical line at around 80 seconds to divide the cepstrogram into two nearly identical parts. As expected, the first 20 and last 30 seconds do differ from each other, which explains why the inverse similarity line does not reach the top left and bottom right corners.

Looking at the cepstrogram made me curious whether I would be able to see this symmetry in the actual audio file, or even find the point of symmetry. So I downloaded the song and put it in Audacity (audio editing software). First I loaded the song in regular playback, then I imported the song again and reversed it. The result can be seen in the image below the cepstrogram.

We see that this song is indeed very repetitive in that it has the same pattern twice, with the first occurrence of this pattern being slightly longer than the second. This pattern contains the following parts:

  1. Some singing without many extras.
  2. Repetition of this singing, but now with added drums.
  3. (Short) buildup.
  4. Drop.

Both occurrences of the pattern contain the same lyrics and melodies, which means that the pattern could be repeated forever. The possibility of reversing the song comes from parts 1 and 2 being exactly the same apart from some added drums, which just follow the rhythm of this part and thus only increase the intensity of the beats.

When two drops are then aligned, the song sounds fairly similar to another instance of itself played in reverse. Only the lyrics sound rather strange, since the voice is reversed as well. This results in an unpleasant experience for the ear, which is why I have not included a link to the combination of these instances.

Whether this song structure is common for Da Tweekaz remains to be seen, but it is at least an interesting feature that might be useful for later research.

Darkest Hour of the Clock
I further analyse interesting patterns from the previous week and bring in another song for this.


As I mentioned in the summary of this week’s update, I saw an interesting feature in the cepstrogram of Lake of Fire: there is one hotspot at c04.

When listening to this specific part of the song I found that it corresponds to a very unique piece of the song, just a wall of noise… I did, however, recognise this part, as it also occurs in another song by D-Block & S-te-Fan: Darkest Hour of the Clock. In that song the same part occurs twice: around 2 minutes, where it is longer and more intense, and around 3 minutes, where it is shorter. I was thus interested whether I would be able to see two hotspots at c04 in this song as well.

For comparison I first plotted the cepstrogram of Lake of Fire, and we indeed see this very bright spot at c04, 50 seconds into the song. If we now look at the c04 row of Darkest Hour of the Clock we again see bright spots. If we look closely we see that these are indeed around the times I expected them to be! This suggests that this is quite a distinctive feature for this artist, although only in these two songs: if we take a look at the cepstrogram of Fallen Souls, we don’t see any bright spots in the c04 row at all.

There might still be a chance that the model I will create in the upcoming weeks will notice these features and will at least be able to classify a song with these features as being produced by D-Block & S-te-Fan.

Introduction

Column

The Idea

For my project I’m going to do research in the genre of hardstyle music. A lot of people would say that the music within this genre is all alike. However, there is a common assumption that each artist distinguishes him- (or her)self with his (or her) unique style and sound. This is most noticeable in the tones used in the so-called drop and in the bass kick.

I’m going to research whether this assumption can be confirmed with a (computer) model. In particular, a classification model that can assign a song to an artist (assuming that the song is by one of the artists used for training the model). If such a model can be built, we can assume that there is indeed something in the songs that is unique to each artist. However, if such a model is not possible, I’m going to research why this is the case, or what is necessary to create such a model.

Column

The Corpus

To do this research we obviously need some data to work with. For this I’m going to use the songs from the top 5 hardstyle artists together with the songs of my 2 favorite artists. Together this gives me a corpus of 698 songs, with roughly 50 songs or more per artist. This should be enough data to build a decent classification model.

Artist               Songs on Spotify
Noisecontrollers     199
Headhunterz          146
Brennan Heart        100
Showtek              88
Da Tweekaz           62
Sub Zero Project     56
D-Block & S-te-Fan   47

The Data

Data Understanding
In this section I do a quick exploration of the data and look whether some patterns are already visible.


Data Understanding

Before trying to build a classifier we first need to explore and understand the data. First of all we need to decide which information we are going to use for the classifier. For example, the genre will probably be similar for each artist and thus will not be useful. Furthermore, there are two possible sets of features we can use to train the classifier:

  • Track Features: these features are returned by the get_track_audio_features() method of the spotifyr package. This method is also used to get the track features in the get_artist_audio_features() and get_album_audio_features() methods. These features describe the song as a whole, so we get one value per feature per song.
  • Track Analysis: these features are obtained using the get_track_audio_analysis() method from the spotifyr package. The analysis features are quite a bit more extensive than the track features and will thus probably contain a lot more information about the song. However, this means more data, which takes up more disk space, takes longer to obtain from Spotify, makes training the classifier take more time and makes the model quite a bit more complex (since we now need to add a time dimension to our model).

For the reasons described above I’m first going to focus on a model built with the track features. If I fail to create a good model with these features, I’m going to take a look at the track analysis features.

The track features include the following numeric features that may be useful:

  • Danceability
  • Energy
  • Key
  • Loudness
  • Mode
  • Speechiness
  • Acousticness
  • Instrumentalness
  • Liveness
  • Valence
  • Tempo

This is a lot of data, in which some features may be very similar for all songs. It is useless to include such features in the training data for the classifier, since they wouldn’t provide good information to distinguish two songs from each other, let alone different artists.
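One simple way to spot features that are (almost) the same for all songs is to look at their variance. The following Python sketch illustrates the idea; it is not the R code used for the portfolio. The feature values, the threshold and the function name `low_variance_features` are all made up for the example and say nothing about the real corpus.

```python
from statistics import pvariance


def low_variance_features(rows, names, threshold=0.005):
    """Return the names of features whose variance over all songs is below threshold."""
    columns = list(zip(*rows))  # one tuple of values per feature
    return [name for name, col in zip(names, columns) if pvariance(col) < threshold]


# Made-up values for four songs; a feature that is identical for every song
# has zero variance and cannot help to tell songs apart.
names = ["energy", "mode", "valence"]
rows = [
    [0.90, 1, 0.20],
    [0.95, 1, 0.60],
    [0.70, 1, 0.35],
    [0.85, 1, 0.80],
]

flagged = low_variance_features(rows, names)
print(flagged)
```

A sensible threshold depends on the scale of each feature (Tempo and Loudness have much larger ranges than the 0 to 1 features), so in practice the features would have to be rescaled first.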

To get a quick overview of which of these features may show clear differences between the artists, I have combined all songs from all artists into one dataset and plotted each feature here. (I would have made the plots interactive, but due to limitations of the plotly package they are just images.) From these plots you can observe that the Mode feature is not useful. Some other features don’t show any clear patterns on their own either. That is why I have also plotted each feature against each of the other features. If one of these plots already showed clusters, we would probably only need those two features to train a classifier. However, I have put these plots in a separate document because of their number (121) and because no plot seems to show any clear clusters.

So although the scatterplots did seem to show some nice patterns, these patterns are very similar among the artists. This means we need to perform Principal Component Analysis (PCA). I will elaborate on this in the Data Preparation section. If the PCA provides us with good clusters, we know that we can quite easily build a classifier. If the PCA doesn’t provide us with any noticeable clusters, it may still be possible to cluster the data, but in higher dimensions. This, however, is quite hard to visualize, so in that case I will probably just feed the data to the classifier and hope that it is able to find relations between the features and the artist.

Data Preparation (Dimensionality Reduction)
Not all data is useful for a classifier, therefore we need to reduce the dimensionality of our dataset.


Dimensionality Reduction

Before we can feed the data to the classifier we first need to prepare it. One part of data preparation is data reduction: we reduce the initial dataset to only the data we are going to use for the classifier. Since we are going to use the track features, we can discard all data other than these features. I combined only this data into a new data frame and saved that as my new corpus. Previously I mentioned removing the Mode feature from our data as well, but since we are now going to perform PCA I will keep this feature for now. Of course we still need to include the Artist in our reduced dataset, since we need it as the class label for the model.

Principal Component Analysis

As mentioned in the Data Understanding section, we need to apply Principal Component Analysis to the data, since the features on their own or relative to one other feature didn’t show any good clusters. Principal component analysis transforms the data into a new dataset in which each column (a principal component) is a combination of the original features, ordered so that the first components capture as much of the variation in the initial data as possible. This data may be even better to use than the features on their own, since the first components concentrate most of the information. First of all I ran a PCA on the data and plotted the first two principal components against each other (see Figure 1).

As you can see, there is still no clear separation between the artists. If anything, we see that they all seem to be very close in terms of PC1 and PC2. However, since the PCA consists, in our case, of 11 principal components, we might not see all differences between the artists in this plot. Thus we need a way to see whether all principal components together can be used to distinguish the artists.
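To illustrate what a principal component is, here is a minimal Python sketch that finds only the first component of a tiny made-up dataset. It is not the R code used for the portfolio: it standardises the columns and then approximates the leading eigenvector of the covariance matrix with power iteration, which is my own simplification for the example. All names and numbers are hypothetical.

```python
from statistics import mean, pstdev


def standardise(rows):
    """Centre every column and scale it to unit (population) standard deviation.

    Note: a constant column has standard deviation 0 and would have to be dropped first.
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    sds = [pstdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def covariance(rows):
    """Covariance matrix of already centred rows (population version)."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)] for i in range(d)]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector and eigenvalue by power iteration."""
    v = [1.0 / (i + 1) for i in range(len(cov))]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(vi * sum(c * x for c, x in zip(row, v)) for vi, row in zip(v, cov))
    return v, eigenvalue


# Made-up data: 5 "songs" with 3 features; the first two features move together.
songs = [[1, 2, 5], [2, 4, 3], [3, 6, 4], [4, 8, 1], [5, 10, 2]]

z = standardise(songs)
loadings, eigenvalue = first_component(covariance(z))
explained = eigenvalue / len(loadings)  # share of the total variance captured by PC1
pc1_scores = [sum(x * l for x, l in zip(row, loadings)) for row in z]

print(loadings, explained)
```

Because the first two toy features are perfectly correlated, they get the same loading and a single component already captures most of the variance. With the 11 track features the picture is less clear-cut, which is why more than two components may be needed.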

K-Means Clustering

We can use clustering for this, and k-means clustering in particular. This clustering algorithm first assigns each data point to a random cluster. Then it iteratively calculates the center point of each cluster and assigns each data point to the cluster with the nearest center point. This is done using a distance measure that handles multidimensional data (such as Euclidean or Manhattan distance). To compare the standard features with the principal components, I have applied k-means clustering with 7 clusters (we have 7 artists) to both the feature data and the principal component data.

If the data can be effectively clustered, we would see 7 separate circles, each containing only one type of point. However, as you can see in Figure 2 and Figure 3 (which can be found in interactive form in the next two stories), most data points are in one big cluster and the circles of all clusters cross each other. We do see that using the principal components as data for the clustering algorithm helped to separate one cluster from all others. But if we examine this cluster, we see that 6 out of the 7 artists have songs in it.
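The loop described above can be sketched in a few lines of Python. This is an illustration, not the R code used for the portfolio, and it differs from the description in one labelled way: instead of a random initial assignment it starts from k randomly chosen data points as center points, which avoids empty clusters at the start. The toy points and the function names are made up.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points into k groups; returns (labels, centres)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # assumption: seed with k random data points
    labels = None
    for _ in range(iterations):
        # Assignment step: every point goes to its nearest centre.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centres[c])) for p in points]
        if new_labels == labels:
            break  # nothing changed, so the clustering has converged
        labels = new_labels
        # Update step: move every centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster became empty
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Two clearly separated made-up groups of points.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centres = kmeans(points, k=2)
print(labels)
```

With 7 artists the call would use k=7 and one row of features (or principal components) per song. Whether the resulting clusters line up with the artists is exactly what Figures 2 and 3 are meant to show.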

[1] "Noisecontrollers"   "Headhunterz"        "Brennan Heart"     
[4] "Da Tweekaz"         "Sub Zero Project"   "D-Block & S-te-Fan"

Figure 2 (interactive)

Figure 3 (interactive)

Data Preparation (Subsetting)
Before we can train a classifier we need training and test data. To reduce data loss I use cross-validation.

(Upcoming) Subsetting the data for the classifier

Since we are going to train a classifier we also need to separate the data into two subsets:

  • A training set, containing about 80% of the data. This data will be used to train the classifier.
  • A test set, containing the remaining 20% of the data. This data will be used to test the classifier.

However, taking only 20% of the data as test data means we get only 140 songs to validate the classifier. In the best scenario this would mean 20 songs per artist, but since we don’t have an equal number of songs per artist we will most likely not get 20 songs per artist in the test data.

For this reason I’m going to perform cross-validation on the data with 5 folds. This means that I’m going to shuffle the data and then divide it into 5 parts (so each part is 20% of the data). After that I will repeatedly take one part as test set and the other parts as training set. This way every song is used once as test data and 4 times as training data, which gives a more reliable estimate of how well the model classifies novel songs than a single 80/20 split.
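The numbers behind this can be worked out from the song counts in The Corpus. The short Python calculation below is only an illustration and assumes that the test set is drawn proportionally from each artist; with a purely random draw the per-artist counts would vary around these values.

```python
# Song counts per artist, taken from the table in The Corpus.
songs_per_artist = {
    "Noisecontrollers": 199,
    "Headhunterz": 146,
    "Brennan Heart": 100,
    "Showtek": 88,
    "Da Tweekaz": 62,
    "Sub Zero Project": 56,
    "D-Block & S-te-Fan": 47,
}

total = sum(songs_per_artist.values())
test_size = round(0.2 * total)

# Expected number of test songs per artist for a proportional 20% split.
expected = {artist: 0.2 * n for artist, n in songs_per_artist.items()}

print(total, test_size)
for artist, n in expected.items():
    print(f"{artist}: about {n:.1f} test songs")
```

So the smallest artist in the corpus would contribute fewer than 10 songs to a single test set, while the largest would contribute almost 40.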
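The splitting procedure described above can be sketched as follows. This is a minimal Python illustration of 5-fold splitting on song indices, not the R code that will be used for the portfolio; the function name `k_fold_indices` and the seed are made up.

```python
import random


def k_fold_indices(n_items, k=5, seed=42):
    """Shuffle the indices 0..n_items-1 and yield one (train, test) pair per fold."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 698 songs in the corpus, 5 folds.
splits = list(k_fold_indices(698, k=5))
for train, test in splits:
    print(len(train), len(test))
```

Since 698 is not divisible by 5, three folds contain 140 songs and two contain 139. A stratified variant would additionally keep the artist proportions equal in every fold, which may be worth considering given the unequal number of songs per artist.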

Testing on Novel Data

Novel Data
I gather the songs that are released after I made my initial corpus, to use as novel test data.

Novel Data

As mentioned earlier, I want to test my models and techniques on novel data; songs that have been released by one of the artists of my corpus after I made my initial corpus. A few songs have been released already. I have put some of them below:

Artist               Track Name
D-Block & S-te-Fan   Lake Of Fire
D-Block & S-te-Fan   Feel It!
D-Block & S-te-Fan   Fallen Souls
Brennan Heart        Born & Raised
Da Tweekaz           Power Of Perception
Da Tweekaz           Knockout
Da Tweekaz           We Made It

Song Similarity using Chroma Features
I compare some songs by comparing their chromagrams.

One way to possibly see whether two songs belong to the same artist is by using chromagrams. From the list of novel songs we see that D-Block & S-te-Fan and Da Tweekaz have released the most songs in the past weeks. So I chose to use two songs from D-Block & S-te-Fan and one song from Da Tweekaz:

  • Lake Of Fire as comparison song.
  • Fallen Souls as base-line song for D-Block & S-te-Fan.
  • Knockout as base-line song for Da Tweekaz.

Then I created a Dynamic Time Warping (DTW) plot between the comparison song and each base-line song, to see whether there are similarities between both songs from D-Block & S-te-Fan and a big difference between the comparison song and the base-line song of Da Tweekaz.
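For readers unfamiliar with dynamic time warping, the following Python sketch shows the core idea on toy data. It is not the R code behind the plots: the frames are made-up two-value vectors instead of 12-value chroma vectors, the Manhattan distance is an arbitrary choice for the example, and the function names are hypothetical.

```python
def frame_distance(p, q):
    """Manhattan distance between two feature frames."""
    return sum(abs(a - b) for a, b in zip(p, q))


def dtw_cost(a, b):
    """Total cost of the cheapest alignment between the frame sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest way to align the first i frames of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(a[i - 1], b[j - 1])
            # Either advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


base = [[1, 0], [0, 1], [1, 0]]
stretched = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 0]]  # same pattern, played slower
different = [[0, 1], [1, 0], [0, 1]]                  # opposite pattern

print(dtw_cost(base, stretched))
print(dtw_cost(base, different))
```

The stretched sequence aligns with the base at zero cost even though it is longer, while the opposite pattern cannot be aligned without cost. A DTW plot visualises the same cost matrix for two real songs.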

The plots can be found below:

In my opinion the DTW between the two D-Block & S-te-Fan songs looks a lot more symmetrical, which would indicate that both songs follow a similar pattern. In the DTW between the D-Block & S-te-Fan song and the Da Tweekaz song, on the other hand, both songs are very distinguishable, which would indicate that they are very dissimilar from each other.

We now need to find a way to convert this dissimilarity into a way to link an artist with a song. So I made cepstrograms and self-similarity matrices for these novel songs and compared them in the next two stories.

Cepstrograms of the songs


Although the cepstrograms look very nice, they do not give a direct clue that Lake of Fire and Fallen Souls are from the same artist. The only thing that can be noticed is that both songs are less loud than Knockout. Sadly we don’t see any patterns that are present in both Lake of Fire and Fallen Souls and not in Knockout. There is, however, one interesting thing about both the cepstrograms and self-similarity matrices of the three songs, but I will explain that in the Self-Similarity Matrix section.

Self-Similarity Matrix of the songs


Looking at the self-similarity matrices we do see very different plots. This, however, is again not what I wanted to see, since I had hoped for shared features between the first two matrices. This was of course to be expected, since both the cepstrograms and similarity matrices are made with the same data/features of the three songs.

The similarity matrix of Fallen Souls looked especially interesting because it has only one yellow horizontal (and thus vertical) line. When listening to all three songs I found out that the yellow parts in both the cepstrograms and similarity matrices of Lake of Fire and Knockout correspond to a ‘drop’, or at least an increase in overall music intensity and especially in the intensity of the bass frequencies.

The interesting bit is that the yellow part in Fallen Souls actually corresponds to the sound of a piano! This piano is supported by a high sound in the background at about the same pitch, resulting in a high magnitude at c02. It is also worth noticing that after this bit there is an increase in magnitude at the c05 level. This is because after both high-magnitude bits at c02 there is a part where a female voice sings: “In a place afar from home, there is a haunted house along an empty road, and if you listen close you can hear the whispers of the fallen souls” before going to a lower pitch. The dark blue squares correspond to the sound of only a synth playing the main melody of the song, explaining the relatively small magnitude.

Upcoming

All items below are a plan and description of the things I’m going to do in the upcoming weeks. These are subject to change as a result of feedback and/or new things learned during the course.

Modeling
This is where the actual models are created, and we finally see what they are capable of!

As mentioned before I’m going to train multiple models, trained on different subsets of the initial dataset:

  • A model trained on all features mentioned in the Data Understanding without the Mode feature.
  • A model trained on (a subset of) the Principal Components.
  • (Optionally) A model trained on (a subset of) the get_track_audio_analysis() features, if necessary.

Depending on the results of these models I may deduce what is causing them to perform a certain way, and build new models that may perform better.

Evaluation
Here I evaluate the performance of each model and explore the reasons for their performance.

For each model we can assess its performance by the number of songs it classified correctly. We can use an extensive confusion matrix to compare the number of correctly classified songs to the number of incorrect classifications and to see which artists are related to each other. After all, if the incorrectly classified songs of a certain artist were almost always assigned to one particular other artist, those artists must be very similar to each other in terms of the features used for the model.
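As a rough illustration of this evaluation, the Python sketch below counts the (actual, predicted) pairs, computes the accuracy and looks up the pair of artists that is confused most often. It is not the R code that will be used for the portfolio; the labels are invented for the example and say nothing about how similar the real artists are.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual artist, predicted artist) pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(counts):
    """Share of songs for which the predicted artist equals the actual artist."""
    correct = sum(n for (a, p), n in counts.items() if a == p)
    return correct / sum(counts.values())


def most_confused(counts):
    """Return the (actual, predicted) pair with the most misclassifications."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return max(errors, key=errors.get) if errors else None


# Invented labels for six test songs.
actual = ["Headhunterz", "Headhunterz", "Headhunterz", "Showtek", "Showtek", "Da Tweekaz"]
predicted = ["Headhunterz", "Showtek", "Showtek", "Showtek", "Showtek", "Da Tweekaz"]

counts = confusion_counts(actual, predicted)
print(accuracy(counts))
print(most_confused(counts))
```

In this toy example two of the three songs of the first artist are assigned to the second, so that pair would be the first candidate for comparing tracks by ear.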

Ideally we want to create a model that uses only the features returned by the get_track_audio_features() method, without diving into the sounds and tune of a song (the data returned by the get_track_audio_analysis() method). I’m going to compare the different models with each other and find out which model performs best and why. If two artists turn out to be very similar in each model, I may look for the most similar tracks and compare them subjectively, for example by listening to them, to see whether those tracks are indeed similar to the ear as well.

For the future I hope that some of the artists used to build the model will release new songs, allowing me to test the model on novel data to see whether it is indeed as good as it proved to be on the test data.